ON  THE  LIMITING  FORM  OF  THE  "CONTAGIOUS"

BINOMIAL  DISTRIBUTION  AND  ITS  APPLICATION 

IN  STOCHASTIC  MODELS  OF  CHOICE  BEHAVIOR 

David  B.  Montgomery 

1.   Introduction 

The  "contagious" binomial  distribution  is  a  discrete  probability 
distribution  which  arises  as  the  steady-state  distribution  of  the  system 
state  variable  in  certain  stochastic  choice  models.  This  paper  presents 
a  continuous  limiting  form  of  this  distribution.  This  continuous  limit 
is  of  theoretical  and  practical  interest  in  applications  of  this  general 
class  of  model. 

In  order  to  set  the  stage  for  the  limiting  distribution  derived 
below,  it  is  well  to  review  briefly  the  general  model,  two  major  classes 
of  applications,  and  the  reasons  for  an  interest  in  developing  this 
limiting  form. 

2.   The  General  Model 

In  the  general  formulation  the  basic  model  component  will  be  referred  to 
as  an  element.   The  total  behavior  of  the  elements  in  the  model  will  be 
termed  system  behavior.   In  the  discussion  of  the  applications  of  the  model 
presented  below,  the  terms  element  and  system  will  be  identified  with  the 
nomenclature  which  has  been  used  in  the  application  of  this  model  type  to 
particular  stochastic  choice  situations. 

Discussion  will  be  facilitated  by  first  considering  the  notation 
which  shall  be  used.   In  the  general  model  an  element  will  be  uniquely 
associated  with  one  of  two  states,  A  or  B.   The  remainder  of  the  notation 
is as follows:

N - the total number of elements in the system.

i - the number of elements, out of the total of N elements, that are in
state A at any particular time. Thus i is a random variable which
represents the state of the system. Clearly, $0 \le i \le N$.

$\alpha$ - the inherent transition intensity (footnote 3), or propensity,
for an element in state B to shift to state A.

$\beta$ - the inherent transition intensity, or propensity, for an element
in state A to shift to state B.

$\gamma$ - the additive incremental influence on the transition intensity
of an element toward the opposite state exerted by each element in the
opposite state.

Thus the propensity for an element in state B to shift to state A is
composed of two parts: an inherent propensity, $\alpha$, and an additive
attractive influence of the i elements already in state A, $i\gamma$, giving
$\alpha + i\gamma$ in total. Similarly the propensity for an element to shift
from state A to state B is $\beta + (N-i)\gamma$, since there are N-i elements
currently associated with state B.

At the level of the elements the process has two states. That is,
each element is in either state A or state B. But at the level of the
system there are N + 1 states corresponding to the N + 1 possible values of
the system state variable, i. The model gives rise to the following system
of differential equations (footnote 4) on the system state probabilities at
time t, which are denoted by $p_i(t)$:

$$
\begin{aligned}
\frac{dp_0(t)}{dt} &= -N\alpha\, p_0(t) + [\beta + (N-1)\gamma]\, p_1(t), && i = 0, \\
\frac{dp_i(t)}{dt} &= -\{(N-i)(\alpha + i\gamma) + i[\beta + (N-i)\gamma]\}\, p_i(t) \\
&\quad + (N-i+1)[\alpha + (i-1)\gamma]\, p_{i-1}(t) \\
&\quad + (i+1)[\beta + (N-i-1)\gamma]\, p_{i+1}(t), && 0 < i < N, \\
\frac{dp_N(t)}{dt} &= -N\beta\, p_N(t) + [\alpha + (N-1)\gamma]\, p_{N-1}(t), && i = N.
\end{aligned}
\qquad (1)
$$
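
The structure of system (1) can be made concrete with a short sketch. The Python below is illustrative only: the function names `birth_rate`, `death_rate`, and `rhs` are hypothetical, and the parameter values in the usage lines are arbitrary assumptions rather than values taken from the paper.

```python
def birth_rate(i, N, alpha, gamma):
    """Total intensity of the move i -> i+1: each of the N-i elements in
    state B shifts to A with intensity alpha + i*gamma."""
    return (N - i) * (alpha + i * gamma)


def death_rate(i, N, beta, gamma):
    """Total intensity of the move i -> i-1: each of the i elements in
    state A shifts to B with intensity beta + (N-i)*gamma."""
    return i * (beta + (N - i) * gamma)


def rhs(p, alpha, beta, gamma):
    """Right-hand side of system (1): the list dp_i/dt for i = 0..N,
    given the current state probabilities p[0..N]."""
    N = len(p) - 1
    dp = []
    for i in range(N + 1):
        outflow = (birth_rate(i, N, alpha, gamma)
                   + death_rate(i, N, beta, gamma)) * p[i]
        from_below = birth_rate(i - 1, N, alpha, gamma) * p[i - 1] if i > 0 else 0.0
        from_above = death_rate(i + 1, N, beta, gamma) * p[i + 1] if i < N else 0.0
        dp.append(from_below + from_above - outflow)
    return dp


# Arbitrary illustrative values (assumptions, not from the paper).
N, alpha, beta, gamma = 4, 1.0, 2.0, 0.5
p_uniform = [1.0 / (N + 1)] * (N + 1)
derivs = rhs(p_uniform, alpha, beta, gamma)
print(derivs)
print(sum(derivs))  # probability is conserved, so this is zero up to rounding
```

The boundary cases i = 0 and i = N need no special handling here, because `death_rate(0, ...)` and `birth_rate(N, ...)` are both zero by construction.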

The steady-state distribution for the state of the system may be found
by the simultaneous solution of the N + 1 equations given as (1) when
$dp_i(t)/dt = 0$ for i = 0, 1, ..., N. The steady-state derivation also makes
use of the fact that

$$\sum_{i=0}^{N} p_i(t) = 1.$$

Coleman [3, p. 345] presents this steady-state distribution as

$$
p_i = \binom{N}{i}\,
\frac{\Gamma\!\left(\frac{\alpha}{\gamma} + i\right)
      \Gamma\!\left(N + \frac{\beta}{\gamma} - i\right)
      \Gamma\!\left(\frac{\alpha + \beta}{\gamma}\right)}
     {\Gamma\!\left(\frac{\alpha}{\gamma}\right)
      \Gamma\!\left(N + \frac{\alpha + \beta}{\gamma}\right)
      \Gamma\!\left(\frac{\beta}{\gamma}\right)}
\qquad (2)
$$

Thus, at stochastic equilibrium the distribution of i, the number of elements
in state A, is given by (2). The concern in this paper is to derive the
distribution of X = i/N as N goes to infinity. That is, the distribution of
the proportion of elements in state A is sought as the number of elements
goes to infinity.
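
As a numerical sanity check on (2), the following sketch evaluates the steady-state probabilities through log-gamma functions and confirms that they sum to one and satisfy the flow-balance condition implied by setting the derivatives in (1) to zero. The function name `contagious_binomial_pmf` and the parameter values are assumptions made for illustration; the sketch also assumes that all three parameters are strictly positive.

```python
import math


def contagious_binomial_pmf(i, N, alpha, beta, gamma):
    """Steady-state probability p_i of equation (2), evaluated through
    log-gamma functions for numerical stability.
    Assumes alpha, beta, gamma > 0 and 0 <= i <= N."""
    a, b = alpha / gamma, beta / gamma
    log_binom = math.lgamma(N + 1) - math.lgamma(i + 1) - math.lgamma(N - i + 1)
    log_num = math.lgamma(a + i) + math.lgamma(N + b - i) + math.lgamma(a + b)
    log_den = math.lgamma(a) + math.lgamma(N + a + b) + math.lgamma(b)
    return math.exp(log_binom + log_num - log_den)


# Arbitrary illustrative values (assumptions, not from the paper).
N, alpha, beta, gamma = 10, 1.0, 2.0, 0.5
pmf = [contagious_binomial_pmf(i, N, alpha, beta, gamma) for i in range(N + 1)]

total = sum(pmf)

# Largest violation of the balance condition
#   p_i * (N-i)(alpha + i*gamma) = p_{i+1} * (i+1)(beta + (N-i-1)*gamma),
# which a stationary solution of the birth-death system (1) must satisfy.
max_imbalance = max(
    abs(pmf[i] * (N - i) * (alpha + i * gamma)
        - pmf[i + 1] * (i + 1) * (beta + (N - i - 1) * gamma))
    for i in range(N)
)
print(total, max_imbalance)
```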

3.   Applications

There  have  been  two  major  classes  of  models  developed  in  these  general 
terms.   These  classes  are  considered  below  along  with  the  motivation  for 
seeking  the  limiting  distribution  in  each  case. 

Class I  Change and Response Uncertainty

The  focus  in  this  class  of  applications  has  been  upon  modeling  the 
choice  between  two  alternatives  in  situations  where  a  respondent  chooses 
A  versus  B  with  probability  P(A)  on  any  given  choice  occasion  and  where  P(A) 
may  change  between  successive  choice  occasions.   Thus  the  model  has  been 
used  in  situations  where  choice  is  stochastically  determined  and  where  the 
probability  of  making  a  particular  choice  is  non-stationary.   Applications 
to experimental data and consumer brand choice may be found in
([2], [3], Chpt. 13, [7], Chpt. 5).

In  these  applications  the  elements  of  the  general  model  are  considered 
to  be  hypothetical  response  elements,  similar  in  nature  to  the  stimulus 
elements  of  stimulus  sampling  theory.   The  system  in  the  general  model  is 
now  the  individual  respondent.   At  any  time  t  an  individual's  probability 
of  making  response  A  versus  response  B  is  given  by  X  =  i/N,  the  proportion 
of  his  N  response  elements  which  are  currently  associated  with  response  A  -- 
that  is,  in  state  A.   Thus  in  this  case  (2)  represents  the  steady-state 
distribution  of  an  individual  respondent's  probability  of  making  response  A 
versus  response  B.   In  a  model  of  consumer  brand  choice  response  A  might 
be  a  purchase  of  the  brand  of  interest  and  response  B  a  purchase  of  some  other 
brand. Note that even at stochastic equilibrium, an individual's probability
of making response A may change.

The desirability of letting the number of response elements increase
without  limit  in  this  case  is  clear.   A  model  which  allows  the  probability 
that  an  individual  will  make  response  A  to  take  on  any  value  between  zero 
and  one  is  more  satisfying  than  one  which  constrains  this  probability  to 
certain  discrete  values.  For  example,  a  model  which  considers  each  individual 
to  have  two  response  elements  implies  that  an  individual's  probability  of 
making  response  A  is  either  0,  1/2,  or  1.  A  model  which  allows  each 
respondent  to  have  an  unlimited  number  of  response  elements  allows  this 
probability,  X,  to  be  a  continuous  measure.   Hence,  the  motivation  for 
seeking  the  limit  of  (2)  in  this  case  is  to  develop  the  steady-state 
distribution  for  a  more  theoretically  satisfying  model  of  the  choice 
probabilities. 

Class II  Reward Models of Interpersonal Influence

Coleman  ([1],  [3],  pp.  343-53)  has  presented  models  of  the  process  of 
interpersonal  influence  where  the  behavior  of  others  influences  an 
individual  to  exhibit  similar  behavior.   In  these  applications  the  elements 
of  the  general  model  correspond  to  individuals  and  the  system  corresponds  to 
the group. The parameters $\alpha$ and $\beta$ correspond to the inherent
propensities of individuals to shift from activity B to activity A and from
A to B, respectively. The parameter $\gamma$ measures the influence which the
behavior of other individuals has over the behavior of any given individual.
Coleman has utilized this form of the model to analyze the impact of
interpersonal influence upon voting behavior in small groups.

The motivation for developing the limiting form of (2) in this case is
one of convenience rather than theoretical desirability. For moderately
large groups, say groups having over twenty-five members, (2) will have
N > 25 and will be computationally inconvenient. This is particularly true
if estimates of the parameters $\alpha$, $\beta$, and $\gamma$ must be obtained
concurrently with a test of the distributional form. If the limiting form
of (2) can be shown to be a well known, well tabulated continuous
distribution, then the continuous approximation would prove very useful
for applications involving relatively large numbers of individuals per
group.

4.   Limiting  Distribution 

In  this  section,  the  distribution  of  X  =  i/N  is  derived  as  the  number 
of  elements,  N,  goes  to  infinity.   In  order  to  determine  the  function  which 
will  yield  a  continuous  probability  density  function  for  X,  it  is  first 
necessary  to  consider  what  occurs  as  N  goes  to  infinity  in  such  a  way  that 
X  =  i/N  remains  constant.   The  latter  restriction  is  included  to  ensure 
that  the  cumulative  probability  p[X  <  C],  where  C  is  some  constant  between 
zero  and  one,  remains  constant  as  N  goes  to  infinity. 

For  N  finite,  the  discrete  mass  function  p(X  =  i/N)  for  X  may  be 
represented  by  a  histogram  such  as  that  presented  in  Figure  1  where 
h(X) denotes the height of the histogram at X.

FIGURE 1
HISTOGRAM OF THE MASS FUNCTION p(X)

Note that the width of each interval is 1/N and the area of the rectangle
about X is the probability that the system has the proportion X = i/N of
its N elements associated with response A. That is,

$$p(X) = \frac{1}{N}\, h(X).$$

Hence

$$h(X) = N\, p(X).$$

Suppose now that in the interval (0,1) more and more elements are
packed in, that is, N becomes large. In this case 1/N will become very
small, and in the limit the h(X) will be so close together that they will
trace out a continuous curve. This continuous curve is the continuous
p.d.f. of X = i/N when $N \to \infty$ in such a way that X = i/N remains
constant. That is, where C indicates that X = i/N remains constant,

$$
\lim_{\substack{N \to \infty \\ X = i/N = C}} h(X)
= \lim_{\substack{N \to \infty \\ X = i/N = C}} N\, p(X)
= f(X)
\qquad (3)
$$

Thus (3) gives the function which must be taken to the limit in order to
obtain the probability density function of X, f(X).
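
The scaling in (3) can be illustrated numerically. The sketch below, with hypothetical function names and arbitrary parameter values, computes the histogram height h(X) = N p(X) at the fixed proportion X = 0.5 for increasing N, and checks that the rectangles of width 1/N have total area one. It evaluates p(X) from the steady-state form (2) and assumes strictly positive parameters.

```python
import math


def pmf(i, N, alpha, beta, gamma):
    """Steady-state probability p_i of equation (2), via log-gamma."""
    a, b = alpha / gamma, beta / gamma
    return math.exp(
        math.lgamma(N + 1) - math.lgamma(i + 1) - math.lgamma(N - i + 1)
        + math.lgamma(a + i) + math.lgamma(N + b - i) + math.lgamma(a + b)
        - math.lgamma(a) - math.lgamma(N + a + b) - math.lgamma(b)
    )


def histogram_height(i, N, alpha, beta, gamma):
    """h(X) = N * p(X) for the bar centred on X = i/N."""
    return N * pmf(i, N, alpha, beta, gamma)


# Arbitrary illustrative values (assumptions, not from the paper).
alpha, beta, gamma = 1.0, 2.0, 0.5

# Heights at the fixed proportion X = 0.5 as N grows (N even, so i = N/2).
heights = {N: histogram_height(N // 2, N, alpha, beta, gamma)
           for N in (10, 100, 1000, 2000)}
for N, h in heights.items():
    print(N, h)

# Each bar has width 1/N, so the total area of the histogram is one.
N = 100
area = sum(histogram_height(i, N, alpha, beta, gamma) * (1.0 / N)
           for i in range(N + 1))
print(area)
```

The printed heights settle down as N increases, which is the behavior the limit in (3) formalizes.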

For  any  fixed  N,  there  is  a  direct  equivalence  between  p(X)  in  (3) 
and  the  p  given  by  (2).   Clearly,  for  some  fixed  N,  the  event  X  =  i/N 
occurs  if,  and  only  if,  the  event  i  occurs.   Hence  for  fixed  N  the  events 
X  =  i/N  and  i  are  equivalent  and  consequently  they  have  the  same  probability 
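of occurrence. Thus $p(X = i/N) = p_i$. But the steady-state distribution of i
for some fixed N has been given in (2). Hence (3) may be written as

$$
f(X) = \lim_{\substack{N \to \infty \\ i/N = C}} N p_i
= \lim_{\substack{N \to \infty \\ i/N = C}} N \binom{N}{i}\,
\frac{\Gamma\!\left(\frac{\alpha}{\gamma} + i\right)
      \Gamma\!\left(N + \frac{\beta}{\gamma} - i\right)
      \Gamma\!\left(\frac{\alpha + \beta}{\gamma}\right)}
     {\Gamma\!\left(\frac{\alpha}{\gamma}\right)
      \Gamma\!\left(N + \frac{\alpha + \beta}{\gamma}\right)
      \Gamma\!\left(\frac{\beta}{\gamma}\right)}
\qquad (4)
$$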

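A result on the limiting behavior of gamma functions when a term
in the gamma argument increases without limit is needed if the limit on the
right hand side of (4) is to be found. The following result is established
as (A-1) in the Appendix:

$$
\lim_{\alpha \to \infty} \Gamma(\alpha + a)
= \lim_{\alpha \to \infty} \alpha^{a}\, \Gamma(\alpha)
\qquad (5)
$$

where a may be complex. Both sides of (5) grow without bound; the statement
is to be read as saying that the ratio of the two sides tends to one. In (5)
and in the Appendix, $\alpha$ denotes a generic argument that increases
without limit, not the transition intensity of Section 2.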
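
The gamma-function result (5) is easy to probe numerically for real a. The sketch below is only an illustration: `gamma_ratio` is a hypothetical helper, the value a = 2.5 is arbitrary, and the check covers real arguments only because `math.lgamma` does not accept complex input.

```python
import math


def gamma_ratio(x, a):
    """Gamma(x + a) / (x**a * Gamma(x)), computed in logs to avoid overflow.
    Here x plays the role of the argument that grows without limit in (5)."""
    return math.exp(math.lgamma(x + a) - a * math.log(x) - math.lgamma(x))


a = 2.5  # arbitrary illustrative real constant (an assumption)
ratios = [(x, gamma_ratio(x, a)) for x in (10.0, 1e2, 1e4, 1e6)]
for x, r in ratios:
    print(x, r)  # the ratio approaches one as x increases
```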

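Using (5) in (4) one finds that

$$
\begin{aligned}
f(X) &= \lim_{\substack{N \to \infty \\ NX = i = NC}} N p_i \\
&= \lim_{\substack{N \to \infty \\ NX = i}}
\frac{\Gamma\!\left(\frac{\alpha + \beta}{\gamma}\right)}
     {\Gamma\!\left(\frac{\alpha}{\gamma}\right)\Gamma\!\left(\frac{\beta}{\gamma}\right)}
\cdot
\frac{N \cdot N\,\Gamma(N)}
     {NX\,\Gamma(NX)\; N(1-X)\,\Gamma(N[1-X])}
\cdot
\frac{N^{\alpha/\gamma} X^{\alpha/\gamma}\,\Gamma(NX)\;
      N^{\beta/\gamma} (1-X)^{\beta/\gamma}\,\Gamma(N[1-X])}
     {N^{(\alpha+\beta)/\gamma}\,\Gamma(N)} \\
&= \frac{\Gamma\!\left(\frac{\alpha + \beta}{\gamma}\right)}
        {\Gamma\!\left(\frac{\alpha}{\gamma}\right)\Gamma\!\left(\frac{\beta}{\gamma}\right)}\,
   X^{(\alpha/\gamma) - 1}\,(1 - X)^{(\beta/\gamma) - 1}
\end{aligned}
\qquad (6)
$$

In the second line the binomial coefficient has been written as
$N\,\Gamma(N) / \{NX\,\Gamma(NX)\; N(1-X)\,\Gamma(N[1-X])\}$ with i = NX, and (5)
has been applied to each of the three gamma functions in (4) whose arguments
grow with N.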

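But (6) is just the beta distribution with parameters $\alpha/\gamma$ and
$\beta/\gamma$, which has mean $\alpha/(\alpha + \beta)$ and variance
$\alpha\beta\gamma / [(\alpha + \beta)^2 (\alpha + \beta + \gamma)]$
(footnote 11). Thus the infinite element, continuous counterpart of the
"contagious" binomial distribution is the beta distribution given in (6).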
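
The convergence of N p_i to the beta density can be checked with a small sketch. Everything here is illustrative: the function names are hypothetical, the parameter values are arbitrary assumptions, and the comparison is restricted to interior points 0 < X < 1 so that the logarithms are defined. The last lines compare the mean and variance of X = i/N at a large but finite N with the limiting mean and variance stated in the text.

```python
import math


def contagious_binomial_pmf(i, N, alpha, beta, gamma):
    """Steady-state probability p_i of the contagious binomial, via log-gamma."""
    a, b = alpha / gamma, beta / gamma
    return math.exp(
        math.lgamma(N + 1) - math.lgamma(i + 1) - math.lgamma(N - i + 1)
        + math.lgamma(a + i) + math.lgamma(N + b - i) + math.lgamma(a + b)
        - math.lgamma(a) - math.lgamma(N + a + b) - math.lgamma(b)
    )


def limiting_density(x, alpha, beta, gamma):
    """Beta density with parameters alpha/gamma and beta/gamma, 0 < x < 1."""
    a, b = alpha / gamma, beta / gamma
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def max_gap(N, alpha, beta, gamma):
    """Largest |N * p_i - f(i/N)| over the interior points i = 1..N-1."""
    return max(
        abs(N * contagious_binomial_pmf(i, N, alpha, beta, gamma)
            - limiting_density(i / N, alpha, beta, gamma))
        for i in range(1, N)
    )


# Arbitrary illustrative values (assumptions, not from the paper).
alpha, beta, gamma = 1.0, 2.0, 0.5
gap_small_N = max_gap(200, alpha, beta, gamma)
gap_large_N = max_gap(2000, alpha, beta, gamma)
print(gap_small_N, gap_large_N)  # the gap shrinks as N grows

# Moments of X = i/N at finite N versus the limiting beta moments.
N = 2000
pmf = [contagious_binomial_pmf(i, N, alpha, beta, gamma) for i in range(N + 1)]
mean_X = sum((i / N) * p for i, p in enumerate(pmf))
var_X = sum((i / N - mean_X) ** 2 * p for i, p in enumerate(pmf))
mean_limit = alpha / (alpha + beta)
var_limit = alpha * beta * gamma / ((alpha + beta) ** 2 * (alpha + beta + gamma))
print(mean_X, mean_limit)
print(var_X, var_limit)
```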

For a suggested Class I use of this limiting result in estimating the
distribution of P(A) across a population of respondents all having the same
$\alpha$, $\beta$, and $\gamma$ see ([7], pp. 98-102). The fact that the beta
is a well tabulated distribution renders this limiting result useful in
Class II applications.

APPENDIX

The purpose of this appendix is to show that

$$
\lim_{\alpha \to \infty} \Gamma(\alpha + a)
= \lim_{\alpha \to \infty} \alpha^{a}\, \Gamma(\alpha)
\qquad (A-1)
$$

In applications to stochastic choice models "a" will be real, but the result
holds even for "a" complex.

It will be convenient to recall Euler's limit formula for $\Gamma(a)$,
which is

$$
\Gamma(a) = \lim_{\alpha \to \infty} \Gamma(a, \alpha)
= \lim_{\alpha \to \infty}
\frac{\alpha!\, \alpha^{a}}{a(a+1)(a+2)\cdots(a+\alpha)}
\qquad (A-2)
$$

Then

$$
\begin{aligned}
\Gamma(a, \alpha)
&= \frac{\alpha!\, \alpha^{a}}{a(a+1)(a+2)\cdots(a+\alpha)} \\
&= \frac{\alpha^{a}\, \Gamma(\alpha + 1)\, (a-1)(a-2)\cdots(1)}{\Gamma(\alpha + a + 1)} \\
&= \frac{\alpha^{a}\, \Gamma(\alpha + 1)\, \Gamma(a)}{\Gamma(\alpha + a + 1)}
\end{aligned}
\qquad (A-3)
$$

recalling that $\Gamma(a) = (a-1)!$ for integer a. For other values of a the
last line follows in the same way from repeated use of the recurrence
$\Gamma(z + 1) = z\,\Gamma(z)$.
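Now consider

$$
\begin{aligned}
\lim_{\alpha \to \infty} \frac{\alpha^{a}\, \Gamma(\alpha)}{\Gamma(\alpha + a)}
&= \lim_{\alpha \to \infty}
   \frac{\alpha^{a}\, \Gamma(\alpha + 1) / \alpha}{\Gamma(\alpha + a + 1) / (\alpha + a)} \\
&= \lim_{\alpha \to \infty}
   \frac{(\alpha + a)}{\alpha} \cdot
   \frac{\alpha^{a}\, \Gamma(\alpha + 1)}{\Gamma(\alpha + a + 1)} \\
&= \lim_{\alpha \to \infty}
   \frac{\alpha^{a}\, \Gamma(\alpha + 1)}{\Gamma(\alpha + a + 1)} \cdot
   \frac{\Gamma(a)}{\Gamma(a)}
   && \text{since } (\alpha + a)/\alpha \to 1 \\
&= \frac{1}{\Gamma(a)} \lim_{\alpha \to \infty}
   \frac{\alpha^{a}\, \Gamma(\alpha + 1)\, \Gamma(a)}{\Gamma(\alpha + a + 1)} \\
&= \frac{1}{\Gamma(a)} \lim_{\alpha \to \infty} \Gamma(a, \alpha)
   && \text{by (A-3)} \\
&= \frac{\Gamma(a)}{\Gamma(a)} = 1
   && \text{by (A-2)}
\end{aligned}
\qquad (A-4)
$$

Then (A-4) is equivalent to

$$
\lim_{\alpha \to \infty} \frac{\Gamma(\alpha + a)}{\alpha^{a}\, \Gamma(\alpha)} = 1,
$$

which in turn is equivalent to (A-1).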
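
Euler's limit formula (A-2) and the closed form (A-3) can both be evaluated directly, which gives a quick check on the steps above. The sketch below is an illustration under stated assumptions: the function names are hypothetical, n stands in for the integer that grows without limit, and a is taken real and positive so that every logarithm is defined.

```python
import math


def euler_gamma(a, n):
    """Gamma(a, n) of (A-2): n! * n**a / (a (a+1) ... (a+n)), computed in logs.
    n stands for the integer that grows without limit in the Appendix."""
    log_value = (math.lgamma(n + 1) + a * math.log(n)
                 - math.fsum(math.log(a + k) for k in range(n + 1)))
    return math.exp(log_value)


def euler_gamma_closed(a, n):
    """Closed form (A-3): n**a * Gamma(n+1) * Gamma(a) / Gamma(n+a+1)."""
    return math.exp(a * math.log(n) + math.lgamma(n + 1)
                    + math.lgamma(a) - math.lgamma(n + a + 1))


a = 2.5  # arbitrary illustrative constant (an assumption); must be positive here
for n in (10, 1000, 100000):
    # The first two columns agree with each other and approach the third.
    print(n, euler_gamma(a, n), euler_gamma_closed(a, n), math.gamma(a))
```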


FOOTNOTES 

1.  Coleman  has  coined  the  term  "contagious  binomial"  to  describe  the 
distribution  considered  in  this  paper. 

2.  Coleman [3, p. 413] expressed the need for this limit but indicated
that he had been unable to derive it.

3.  For a discussion of the notion of transition intensity in relation to
continuous time stochastic processes see [5, pp. 423-8].

4.  The model is a special case of the general birth-death process in
which the population has an upper bound, N, and the potential birth
pool is constrained to N-i. In this case a "birth" is regarded as
an element changing from state B to state A. The converse represents
a "death".

5.  An alternative parameterization of this distribution has been given
as

$$
p_i = \binom{N}{i}\,
\frac{\prod_{j=0}^{i-1} (a + jc)\; \prod_{j=0}^{N-i-1} (1 - a + jc)}
     {\prod_{j=0}^{N-1} (1 + jc)}
$$

in Coleman ([1], p. 123, [3], p. 345) where $a = \alpha/(\alpha + \beta)$ and
$c = \gamma/(\alpha + \beta)$.

6.  For the derivation of this model type from an axiomatic system
analogous to those used in stimulus sampling theory see ([7], Chpt. 2).

7.  This restriction is analogous to that used in the development of the
Poisson limit of the binomial distribution. See [5].

8.  See [7] for a proof that (6) satisfies the steady-state form of the
Fokker-Planck diffusion equation which governs the process.

9.  See ([4], Chpt. 9) for a statement of this result without proof.

10.  See ([4], p. 209).

11.  See ([8], p. 216).

12.  See ([6], p. 77).